Introduction

The genetic code of the last common ancestor (LCA), or a minor variant of 
it, is present in all species. Its origin, in the pre-LCA era, has remained an
enigma at the core of Molecular Biology for four decades. My analysis
reveals that the diverse regularities observed in code structure correlate 
strongly with path-distances in amino acid synthesis. This clearly indicates 
that the code evolved by adding amino acids as they appeared, during the 
growth of synthesis pathways outward from central metabolism.

Synthesis Pathways Order Amino Acids and Codons


Amino acids with short,
medium and long paths are
chemically distinct and
encoded differently:

(1) Four NH4+ fixers,
including both anionic
residues (red), have codons
solely from the NAN column
and 1-2 step paths.

(2) Ten amino acids have
alkyl, hydroxy or S-bearing
side-chains, and 4-7 step
paths. Consensus NCN, NGN
and NUN triplets code
for 4-, 5- and 7-step
residues, respectively.

(3) Six basic (blue) and
aromatic residues form
on 9-14 step paths
and are encoded
by codon doublets.
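The column-based description above can be checked against the standard code table. The following is a minimal sketch, not the author's analysis: it groups amino acids by the mid-base column (NAN, NCN, NGN, NUN) of their codons. The names `CODE`, `columns_by_amino_acid` and `single_column` are illustrative, and the sketch says nothing about path lengths, which are not tabulated in this text.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acids ("*" = stop),
# ordered by first, second, then third base over U, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def columns_by_amino_acid(code):
    """Map each amino acid to the set of mid-base columns that encode it."""
    cols = defaultdict(set)
    for codon, aa in code.items():
        if aa != "*":  # skip stop codons
            cols[aa].add("N" + codon[1] + "N")
    return dict(cols)


COLUMNS = columns_by_amino_acid(CODE)

# Amino acids whose codons all fall within a single column.
single_column = {
    col: sorted(aa for aa, c in COLUMNS.items() if c == {col})
    for col in ("NAN", "NCN", "NGN", "NUN")
}

for col, residues in single_column.items():
    print(col, " ".join(residues))
```

Serine is the only amino acid that spans two columns (NCN and NGN); every other residue is confined to one.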

A 14-fold, exponential fall- 
off in codon assignments 
accompanied path extension 
from 4 to 14 steps, in a 
gradual freezing of the code. 
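As a rough worked example of the fall-off just described, and assuming it follows a single exponential over the whole 4 to 14 step range (an assumption of this sketch, not a fitted result from the poster), the implied per-step rate can be computed directly. The function name `decay_rate` is illustrative.

```python
import math


def decay_rate(fold, start_step, end_step):
    """Rate k such that exp(k * (end_step - start_step)) equals `fold`."""
    return math.log(fold) / (end_step - start_step)


# A 14-fold fall-off spread evenly (on a log scale) over steps 4 to 14.
k = decay_rate(14, 4, 14)
per_step = math.exp(k)       # fold-reduction per added synthesis step
halving = math.log(2) / k    # steps needed to halve codon assignments

print(f"k = {k:.3f} per step")
print(f"fold-reduction per step = {per_step:.2f}")
print(f"halving distance = {halving:.2f} steps")
```

Under this assumption, each additional synthesis step reduces codon assignments by a factor of about 1.3.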
Amino Acid Path-Distance Patterns in the Genetic Code

• Codons for amino acids with 2-, 4-, 5-, and 7-step paths exhibit a
columnwise distribution, NAN → NCN → NGN → NUN: p = 5.6 × 10^-10

• Codons in 7 of 8 intact boxes code for group 1 + 2 amino acids (2-7 step).
Codons in 5 of 8 subdivided boxes code for group 3 residues (9-14 step).
Codon distribution: p(1+2 vs. 3) = 4.6 × 10^-5

[Figure legend: fixers; S-containing; basic; aromatic; path-distance advanced 1-step; mean transfer free energy]
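The intact and subdivided boxes referred to above can be listed from the standard code table. This is a minimal sketch under one assumption: a "box" is the set of four codons sharing the first two bases, called intact when all four encode the same amino acid and subdivided otherwise. The names `CODE`, `boxes`, `intact` and `subdivided` are illustrative. The sketch does not attempt to reproduce the p-values quoted above.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one-letter amino acids ("*" = stop),
# ordered by first, second, then third base over U, C, A, G.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def boxes(code):
    """Map each box (first two bases + 'N') to the set of symbols it encodes."""
    return {
        b1 + b2 + "N": {code[b1 + b2 + b3] for b3 in BASES}
        for b1, b2 in product(BASES, repeat=2)
    }


BOXES = boxes(CODE)
intact = sorted(box for box, aas in BOXES.items() if len(aas) == 1)
subdivided = sorted(box for box, aas in BOXES.items() if len(aas) > 1)

print("intact:    ", {box: "".join(BOXES[box]) for box in intact})
print("subdivided:", {box: "".join(sorted(BOXES[box])) for box in subdivided})
```

The table splits evenly into 8 intact and 8 subdivided boxes. The only intact box that encodes a basic residue is CGN (arginine).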

Code Regularities Correlate with Amino Acid Path-Distances 

N-Fixers Expansion Overprinting 


Monophyletic Pre-LCA tRNA Bear Sibling Amino Acids

Amino acid source: α-KG, Pyr, OA, PG, Shi

• Positive tRNA Jukes-Cantor distance vs. amino acid path-distance:
p(P < D) = 5.2 × 10^-[…]

order: p = 2.05 × 10^-[…]

• tRNA amino acid identity in tree clusters: p = 1.36 × 10^-[…]

Clusters: p = 1.[…]4 × 10^-[…]

Conserved traces of pre-LCA tRNA sequences in the tree show that early tRNA adaptors
for amino acids derived from a common precursor diversified from a common
ancestor (sibling adaptors) and were cognate for similar codons.
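The tRNA tree statistics above relate sequence distance to amino acid path-distance. Assuming the sequence distance is a Jukes-Cantor corrected distance (the bullet is only partly legible, so this is an assumption), a minimal sketch of that standard correction follows. The function names and the two short sequences are made up for illustration; they are not tRNA data from the poster.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences (gaps skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - (4/3) * p)."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to exist")
    return -0.75 * math.log(1 - 4 * p / 3)


# Toy aligned fragments (hypothetical, for illustration only).
p = p_distance("GCGGAUUU", "GCGGACUU")
print(p, jukes_cantor(p))
```

The correction grows faster than the raw mismatch fraction because it allows for multiple substitutions at the same site.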

Phases Identified in Code Evolution 


[Figure: panels A-F; axis label: coding-capacity ratio]


• A and B show that allocation of stable intact code boxes coincided with assignment of
G-enriched triplets to small amino acids (4-6 step paths) in code formation.

• C, decline in C oxidation number with path-distance in residues from citrate cycle.

• D, encoding of polar and hydrophobic residues by NAN and NUN codons,
respectively, conforms with column growth at stages 2 and 7 of code formation.


• E, triplets accumulate fastest in code boxes and slowest as single codons, 
consistent with subdivision of boxes into doublets, then doublets into singles. 

• F, elevated mid-base coding capacity agrees with most of the amino acids entering
the code during columnwise expansion through codon mid-base substitutions.


• Amino acids with short (group 1), medium (group 2) and long (group 3) paths were
encoded in distinct phases of code evolution: NH4+ Fixers, Expansion, Overprinting.

• Domains of contiguous codons read by related tRNA, with core group homology
and bearing sibling amino acids, spread along code rows during its formation.


Conclusions 

• Reconstruction of the path of code evolution that led to its diverse 
structural regularities has been achieved, upon equating the temporal 
order of amino acid addition to the code with the number of steps 
required for synthesis. The features of code structure unified by the
path-distance model include:

- Taylor-Coates amino acid size/code box rule 

- Dunnill G/C codon base/box rule 

- Dillon reductive amino acid synthesis prediction 

- Woese amino acid clusters 

- Dillon codon set subdivision rule 

- Perlwitz codon-base coding capacity ratio 
together with sixteen other regularities. 

• Bifunctional pre-LCA tRNA species, serving as cofactors in amino 
acid synthesis and adaptors in translation, were credited with 
coordinating code formation with the growth of amino acid synthesis 
pathways. 

• Frozen within the Standard Code are vestiges of previous codes. They
provide compelling evidence for a chemoautotrophic origin of life on a
cationic mineral surface:

- Residues in the first proteins were traced to a primal NH4+-fixing
mechanism coupled to the citrate cycle, which fixes CO2
autocatalytically, under reductive early Earth conditions.

- Early proteins were polyanionic and they became increasingly 
hydrophobic during code expansion, consistent with prolonging their 
dwell-time on a cationic mineral surface within an aqueous system. 

• Extinction of the charge-attraction principle on completion of expansion
phase, with subsequent incorporation of basic residues encoded by
triplets captured in subdivided error-prone boxes, supports a
[…] membrane-covered cells as the code […] the final phase of its […]